Data Handling
MGMT 675
AI-Assisted Financial Analysis
Kerry Back

Topics
- Merge
- Filter
- Sort
- Aggregate by group
- Transform
Datasets
- metrics.xlsx and tickers.xlsx.
- Online data from various sources.
Merge metrics and tickers
- Ask Julius to merge the datasets on the ticker column
- Ask Julius the following.
- How many rows are there?
- What are the column names?
- What are the unique values in the category column?
- What are the unique values in the sector column?
- Show the head of the data frame.
Filter
- Ask Julius to filter on the category column to “Domestic Common Stock” and “Domestic Common Stock Primary Class.” Ask Julius to call this data frame common_stock.
- Ask Julius to create a copy of the common_stock data frame that contains only rows for which pe \(>0\).
- Ask Julius to create a copy of the common_stock data frame that contains only rows for which marketcap is above the median marketcap.
Sort
- Ask Julius to sort on marketcap in descending order and to show the head of the data frame.
Aggregate by group
- Ask Julius to describe marketcap.
- Ask Julius to compute the mean marketcap by sector.
- Ask Julius to compute the number of firms by sector.
- Ask Julius to compute the total marketcap by sector.
- Ask Julius to compute the mean pe grouped by (sector, scalemarketcap) and to display the results as a two-dimensional table.
- Ask Julius to recreate the table using only rows for which pe \(>0\).
- Ask Julius to compute the percent of firms for which pe \(<0\) by sector.
Yahoo Finance
- Daily open, high, low, close, adjusted close, volume
- Income statement, balance sheet, and statement of cash flows for past 5 years
- Current market option data (bid, ask, last price, open interest, implied volatility, …)
- Can get with yfinance library
Yahoo’s Adjusted Closing Prices
- Yahoo’s adjusted closing prices are adjusted for splits and dividends.
- The percent change in the adjusted closing price is the daily close-to-close return including dividends.
Caveat
On ex-dividend days, the percent change in the adjusted closing price is \[\frac{P_{t}}{P_{t-1}-D_t} - 1\] rather than what we might prefer: \[\frac{P_{t} + D_t}{P_{t-1}} - 1\] but this is a minor issue (small difference 4 days a year).
Monthly, Annual, … Returns
If we want returns at a different frequency, for example annual returns, then we can either
- compound the daily returns, or
- downsample the adjusted closing prices to annual data and compute the percent change of the downsampled data.
Example
- Ask Julius to use yfinance to get adjusted closing prices for SPY for the longest history available.
- Ask Julius to downsample the prices to end-of-month.
- Ask Julius to compute monthly returns as the percent change in the downsampled prices.
Federal Reserve Economic Data
- Ask Julius to use the pandas-datareader to get the history of crude oil prices from FRED.
- Ask Julius to get the history of inflation rates from FRED.
Ken French’s Data Library
- Ask Julius to get the Fama-French factors from Ken French’s data library.
- Ask Julius to list the datasets on Ken French’s data library.
- Ask Julius to get the 48 industry returns from Ken French’s data library.
Scraping
- Ask Julius to find the constituents of the S&P 100.
- When Julius provides a link, ask Julius to read the table at the link.